ZERO BIASING AND GROWTH PROCESSES 



JASON FULMAN AND LARRY GOLDSTEIN 

Abstract. The tools of zero biasing are adapted to yield a general
result suitable for analyzing the behavior of certain growth processes. The
main theorem is applied to prove central limit theorems, with explicit
error terms in the $L^1$ metric, for certain statistics of the Jack measure
on partitions and for the number of balls drawn in a Polya-Eggenberger
urn process.



1. Introduction 

Zero biasing for the normal approximation of a random variable $W$ using
Stein's method was introduced in Goldstein and Reinert [GR]. One instance
in which the zero bias method may be applied is for $W$ for which a Stein pair
$W, W'$ may be constructed, that is, for $W$ that may be coupled to a variable
$W'$ such that $W, W'$ is exchangeable and satisfies $E(W'|W) = (1 - a)W$ for
some $a \in (0,1]$. After giving a brief review of these methods in Section 2,
in Section 3 we provide a general result allowing one to apply zero biasing
when the statistic $W$ of interest is formed by certain growth processes and
can be coupled in a Stein pair.

Section 4 studies a certain statistic $W_\alpha$ under the Jack$_\alpha$ measure on
partitions. We defer precise definitions to Section 4 but for now mention that it
is of interest to study statistical properties of Jack$_\alpha$ measure. The case
$\alpha = 1$ corresponds to the actively studied Plancherel measure of the
symmetric group. The surveys [AlD], [De], [O2] and the seminal papers [BOO],
[J], [O1] indicate how the Plancherel measure of the symmetric group is a
discrete analog of random matrix theory, and describe its importance in
representation theory and geometry. Okounkov [O2] notes that the study of
Jack$_\alpha$ measure is an important open problem, about which relatively little
is known. It is a discrete analog of Dyson's $\beta$ ensembles from random
matrix theory [BO1].

The particular statistic $W_\alpha$ under Jack measure which we study is of
interest for several reasons. When $\alpha = 1$ it reduces to the character ratio
of transpositions under Plancherel measure, or equivalently to the spectrum
of the random transposition walk. Also by Corollary 1 of [DH], there is
a natural random walk on perfect matchings of the complete graph on $n$
vertices, whose eigenvalues are precisely $W_2(\lambda)/\sqrt{n(n-1)}$, occurring
with multiplicity proportional to the Jack$_2$ measure of $\lambda$. The proofs to
date of the central limit theorem for $W_\alpha$ range from combinatorial ones
using the method of moments in [K1], [H], [Sn], to the use of Stein's method,
which produces an error term (but with no explicit constant) in the
Kolmogorov metric [F1], [F2], [SS]. Our contribution is to prove a central
limit theorem in the $L^1$ metric, with a small explicit constant.

Section 5 applies the main result of Section 3 to study a growth process
arising from the Polya-Eggenberger urn model. More precisely, imagine an
urn $\mathcal{U}_{A,B}$ containing $A$ white balls and $B$ black balls. At each time
step one ball is drawn, and returned to the urn along with $m$ balls of the
same color. This is one of the simplest urn models, discussed in detail in the
textbooks [JK] and [M]. We obtain a central limit theorem with explicit error
term for the number of white balls drawn after $n$ steps. While [JK] and [M]
contain many useful results and pointers to the literature, including some
central limit theorems in more general settings, to the best of our knowledge
the literature does not contain results that provide such error terms for this
problem.



2. Stein's method and zero biasing 

Stein's lemma [S1] states that a random variable $Z$ has the mean zero
normal distribution $\mathcal{N}(0, \sigma^2)$ if and only if

(1) $\sigma^2 E f'(Z) = E[Z f(Z)]$

for all absolutely continuous functions $f$ for which these expectations exist.
Motivated by this characterization, for a mean zero, variance $\sigma^2$ random
variable $W$ and a given function $h$ on which to test the difference between
$Eh(W)$ and $Nh = Eh(Z)$, Stein [S1] considered the differential equation

(2) $\sigma^2 f'(w) - w f(w) = h(w) - Nh$.

For the unique bounded solution $f$ of (2), one can evaluate the required
difference by substituting $W$ for $w$ and taking expectation, to yield

$Eh(W) - Nh = E[\sigma^2 f'(W) - W f(W)]$.

Though it may not be immediately clear why the right hand side may be
simpler to evaluate than the left, a variety of techniques have been developed
to handle various situations. For instance, the exchangeable pair technique
from [S2] handles the expectation of the right hand side when the given
random variable $W$ can be coupled to $W'$ so that $(W, W')$ is an $a$-Stein pair,
that is, an exchangeable pair that satisfies

(3) $E(W'|W) = (1 - a)W$ for some $a \in (0, 1)$.

Other techniques for handling the Stein equation are discussed in detail in
[C1] and in the references therein, but of particular relevance here is the
zero bias coupling, which we now review.

Though the mean zero normal is the unique distribution satisfying (1), one
can ask whether a given variable satisfies a like identity of its own. Indeed,
it is shown in [GR] that for every mean zero, variance $\sigma^2$ random variable
$X$, there exists a distribution for a random variable $X^*$, termed the $X$-zero
biased distribution, such that

(4) $\sigma^2 E f'(X^*) = E[X f(X)]$

for all absolutely continuous functions $f$ for which these expectations exist.
The mapping of $\mathcal{L}(X)$, the distribution of $X$, to $\mathcal{L}(X^*)$, is known as
the zero bias transformation. In particular, Stein's lemma (1) can be rephrased
as the statement that the mean zero normal $\mathcal{N}(0, \sigma^2)$ is the unique fixed
point of the zero bias transformation characterized by (4).
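As a small illustration of identity (4), not taken from the text: for $X$ equal to $+1$ or $-1$ with probability $1/2$ each (mean zero, variance one), it is a standard fact that the $X$-zero biased distribution is uniform on $[-1,1]$. The sketch below checks (4) exactly on the monomials $f(x) = x^k$; the function names are hypothetical and chosen only for this example.

```python
from fractions import Fraction

def moment_uniform(k):
    """E[U^k] for U uniform on [-1, 1], computed exactly."""
    return Fraction(0) if k % 2 else Fraction(1, k + 1)

def lhs(k, var=Fraction(1)):
    """sigma^2 * E f'(X*) for f(x) = x^k, assuming X* is uniform on [-1, 1]."""
    if k == 0:
        return Fraction(0)          # f is constant, so f' = 0
    return var * k * moment_uniform(k - 1)

def rhs(k):
    """E[X f(X)] = E[X^(k+1)] for X = +1 or -1 with probability 1/2 each."""
    return Fraction(1) if (k + 1) % 2 == 0 else Fraction(0)

# One (k, left side, right side) triple per monomial degree.
checks = [(k, lhs(k), rhs(k)) for k in range(8)]
```

Both sides agree for every degree tried, as linearity then extends the identity to all polynomials.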

Heuristically, then, if the transformation has a fixed point at the mean
zero normal, then an approximate fixed point should be approximately
normal. This heuristic has been made precise for a variety of examples in [GR],
[G1], [G2], [G3] and [G4] (see also [C1]) in order to yield bounds in both the
Kolmogorov and $L^1$ metric. For the latter, the following result from [G4] is
often useful; we use $\|\cdot\|_1$ to denote the $L^1$ metric.

Theorem 2.1. If the mean zero, variance 1 random variable $W$ can be
coupled to $W^*$ having the $W$-zero bias distribution, then

$\|\mathcal{L}(W) - \mathcal{L}(Z)\|_1 \le 2E|W^* - W|$

where $Z$ is a standard normal variable.

Hence, to obtain $L^1$ bounds, the question reduces to finding a way to
couple $W$ and $W^*$. Lemma 2.2 below, from [GR], noting here that the result
holds also for $a = 1$, shows how the construction of a variable $W^*$ with the
$W$-zero bias distribution can be achieved with the help of the distribution
$dF(w, w')$ of a Stein pair. First, it can easily be shown from (3) that if
$W, W'$ is an $a$-Stein pair possessing second moments then

(5) $EW = 0$ and $E(W' - W)^2 = 2a\,\mathrm{Var}(W)$,

so in particular,

(6) $dF^\dagger(w, w') = \frac{(w' - w)^2}{2a\,\mathrm{Var}(W)}\, dF(w, w')$

is a bivariate distribution.

Lemma 2.2. If $W^\dagger, W^\ddagger$ have distribution $dF^\dagger$, where $F(w, w')$ is the
joint distribution of an $a$-Stein pair, and $U$ is a uniformly distributed
variable on $[0,1]$, independent of $W^\dagger, W^\ddagger$, then

$W^* = U W^\dagger + (1 - U) W^\ddagger$

has the $W$-zero bias distribution.

In particular, if $W$ and $W^\dagger, W^\ddagger$ can be constructed on a common space,
then $W$ and $W^*$ can be also.

We remark that a number of results are available when $(W, W')$ is only
an approximate Stein pair, that is, an exchangeable pair that satisfies the
linearity condition (3) with a remainder; see for instance [RR] and [C1].
Correspondingly, here we expect the conclusions of Theorems 2.1 and 3.1 to
hold for approximate Stein pairs by including in the bounds the additional
terms that arise from such remainders.

In what follows we study processes for which the random variable $W$ of
interest can be written as the sum $V + T$, where $V$ is a function of a variable
$\tau$ determined by the process run to a penultimate state, and $T$ a function
of running the process for one additional step. In our examples, given $\tau$, a
Stein pair $(W, W') = (V + T, V + T')$ can be constructed by running two
copies of the last step of the chain, forming $T$ and $T'$ conditionally
independent given $\tau$.

In such cases a pair of random variables with distribution (6) can be
similarly constructed by forming
$(W^\dagger, W^\ddagger) = (V^\circ + T^\dagger_{\tau^\circ}, V^\circ + T^\ddagger_{\tau^\circ})$ for $V^\circ$
and $T^\dagger_{\tau^\circ}, T^\ddagger_{\tau^\circ}$ sampled by biasing the distributions of $V$ and
$T, T'$ in a certain way. Our first application of Theorem 3.1, to Jack measure,
is particularly simple since the biasing factor to form the $V^\circ$ distribution
from that of $V$ is unity, and we may therefore take $V = V^\circ$. For our second
example, the Polya-Eggenberger urn, we will see that biasing draws from the
urn $\mathcal{U}_{A,B}$ in our process results in the urn $\mathcal{U}_{A+m,B+m}$.

3. General Result 

The purpose of this section is to prove the following theorem. 

Theorem 3.1. Consider a bivariate distribution $\mathcal{L}(\tau, T)$ on a random
object $\tau$ and random variable $T$, and a $\tau$ measurable random variable
$V = V_\tau$ such that sampling $\tau$, and then, given $\tau$, sampling $T$ and $T'$
independently from the conditional distribution $\mathcal{L}(T|\tau)$, the random
variables

(7) $W = V + T$ and $W' = V + T'$

have variance one and are an $a$-Stein pair. Denoting

(8) $E(T|\tau) = \mu_\tau$ and $E((T - \mu_\tau)^2|\tau) = \sigma_\tau^2$,

and the distribution of $\tau$ by $dF(\tau)$, the measure $F^\circ(\tau)$ specified by

(9) $dF^\circ(\tau) = \frac{\sigma_\tau^2}{a}\, dF(\tau)$

is a probability measure, and for any coupling of $\tau$ to $\tau^\circ$ with
distribution $F^\circ$, we have

(10) $\|\mathcal{L}(W) - \mathcal{L}(Z)\|_1 \le 2E|(V_{\tau^\circ} - V) + (\mu_{\tau^\circ} - \mu_\tau)| + 2E|T - \mu_\tau| + \frac{E|T - \mu_\tau|^3}{\mathrm{Var}(T - \mu_\tau)}$.

When $\mu_\tau$ equals zero and $\sigma_\tau^2$ is constant almost surely, then

(11) $\|\mathcal{L}(W) - \mathcal{L}(Z)\|_1 \le 2E|T| + \frac{E|T^3|}{\mathrm{Var}(T)}$.



Proof. First consider the case where $\mu_\tau = 0$ a.s. Since conditional on
$\tau$ the pair $T$ and $T'$ are independent, we have
$E[T'T|\tau] = E[T'|\tau]E[T|\tau] = 0$, and therefore, from (7) and (8),

(12) $E((W' - W)^2|\tau) = E((T' - T)^2|\tau) = 2\sigma_\tau^2$.

Taking expectation and applying (5), we have that

(13) $E\sigma_\tau^2 = a$,

verifying that $dF^\circ(\tau)$ is a probability measure.

By construction, the joint distribution of $(T, T', \tau)$ is, with some abuse of
notation, given by

$dF(t, t', \tau) = dF(t'|\tau)\,dF(t|\tau)\,dF(\tau)$,

and therefore the pair $(W, W')$ has distribution

(14) $dF(w, w') = \int_{\tau,t,t' :\, v+t=w,\ v+t'=w'} dF(t'|\tau)\,dF(t|\tau)\,dF(\tau)$,

where $v = V_\tau$. By Lemma 2.2, with $U$ an independent uniform random
variable on $[0, 1]$,

$W^* = U W^\dagger + (1 - U) W^\ddagger$

has the $W$-zero bias distribution when $(W^\dagger, W^\ddagger)$ has distribution given by

$dF^\dagger(w, w') = \frac{(w' - w)^2}{2a}\, dF(w, w')$.

For any fixed $\tau$ let $F(t|\tau)$ denote the conditional distribution of $T$ given
$\tau$. By (12), for every $\tau$ the measure

(15) $dF^\dagger_\tau(t, t') = \frac{(t' - t)^2}{2\sigma_\tau^2}\, dF(t'|\tau)\,dF(t|\tau)$

is a bivariate probability distribution.

Now using (14), (13) and (15),

$dF^\dagger(w, w') = \frac{(w' - w)^2}{2a} \int_{\tau,t,t' :\, v+t=w,\ v+t'=w'} dF(t'|\tau)\,dF(t|\tau)\,dF(\tau)$

$= \int_{\tau,t,t' :\, v+t=w,\ v+t'=w'} \frac{(t' - t)^2}{2a}\, dF(t'|\tau)\,dF(t|\tau)\,dF(\tau)$

$= \int_{\tau,t,t' :\, v+t=w,\ v+t'=w'} \frac{\sigma_\tau^2}{a}\, \frac{(t' - t)^2}{2\sigma_\tau^2}\, dF(t'|\tau)\,dF(t|\tau)\,dF(\tau)$

(16) $= \int_\tau \left( \int_{t,t' :\, v+t=w,\ v+t'=w'} \frac{(t' - t)^2}{2\sigma_\tau^2}\, dF(t'|\tau)\,dF(t|\tau) \right) \frac{\sigma_\tau^2}{a}\, dF(\tau)$

$= \int_\tau \left( \int_{t,t' :\, v+t=w,\ v+t'=w'} dF^\dagger_\tau(t', t) \right) dF^\circ(\tau)$.

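The factorization in the integral indicates that given $\tau^\circ$ with distribution
$dF^\circ(\tau)$, the pair $(W^\dagger, W^\ddagger)$ can be generated by sampling
$T^\dagger_{\tau^\circ}, T^\ddagger_{\tau^\circ}$ from $dF^\dagger_{\tau^\circ}(t', t)$, and then setting

$W^\dagger = V_{\tau^\circ} + T^\dagger_{\tau^\circ}$ and $W^\ddagger = V_{\tau^\circ} + T^\ddagger_{\tau^\circ}$,

where $V_{\tau^\circ}$ is the value of $V$ on $\tau^\circ$. In particular, letting

(17) $T^{\tau^\circ} = U T^\dagger_{\tau^\circ} + (1 - U) T^\ddagger_{\tau^\circ}$,

we have that

$W^* = U(V_{\tau^\circ} + T^\dagger_{\tau^\circ}) + (1 - U)(V_{\tau^\circ} + T^\ddagger_{\tau^\circ}) = V_{\tau^\circ} + T^{\tau^\circ}$

has the $W$-zero biased distribution.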

For a fixed $\tau$, let $T_\tau$ and $T'_\tau$ denote independent copies of a random
variable with distribution $dF(t|\tau)$. Clearly $T_\tau$ and $T'_\tau$ are
exchangeable, and as $\mu_\tau = 0$, we have $E(T) = E(E(T|\tau)) = E\mu_\tau = 0$
and therefore $E(T'_\tau|T_\tau) = E(T'_\tau) = 0$. Hence $(T_\tau, T'_\tau)$ is a 1-Stein
pair. In view of (15), Lemma 2.2 yields that when $T^\dagger_\tau, T^\ddagger_\tau$ have
distribution $F^\dagger_\tau(t, t')$ and $U$ is an independent uniform random variable,

(18) $T^*_\tau = U T^\dagger_\tau + (1 - U) T^\ddagger_\tau$

has the $T_\tau$-zero biased distribution.
As $E(T) = 0$, by (13) we obtain

$a = E\sigma_\tau^2 = E(E(T^2|\tau)) = E(T^2) = \mathrm{Var}(T)$.

Comparing (17) and (18), we see that the distribution $\mathcal{L}(T^{\tau^\circ})$ is the
mixture of the distributions $\mathcal{L}(T^*_\tau)$ with mixing measure
$\sigma_\tau^2/\mathrm{Var}(T)$, by (16). Therefore, by Theorem 2.1 of [G3], $T^{\tau^\circ}$ has the
$T$-zero bias distribution. Applying the zero bias identity (4) with
$f(x) = (1/2)x^2\,\mathrm{sign}(x)$, for which $f'(x) = |x|$, we have

$E|T^{\tau^\circ}| = \frac{E|T^3|}{2\,\mathrm{Var}(T)}$.

Now, with $\tau$ and $\tau^\circ$ the given coupling, letting $V = V_\tau$ and $T$ be
sampled from $\mathcal{L}(T|\tau)$, setting $(W, W^*) = (V + T, V_{\tau^\circ} + T^{\tau^\circ})$ yields
a coupling of $W$ and $W^*$ on the same space, satisfying

$E|W^* - W| = E|V_{\tau^\circ} - V + T^{\tau^\circ} - T|$

$\le E|V_{\tau^\circ} - V| + E|T| + E|T^{\tau^\circ}|$

$= E|V_{\tau^\circ} - V| + E|T| + \frac{E|T^3|}{2\,\mathrm{Var}(T)}$.

Theorem 2.1 now yields

(19) $\|\mathcal{L}(W) - \mathcal{L}(Z)\|_1 \le 2E|V_{\tau^\circ} - V| + 2E|T| + \frac{E|T^3|}{\mathrm{Var}(T)}$.

When $\sigma_\tau^2$ is constant we have that $dF^\circ(\tau) = dF(\tau)$, and hence may let
$\tau^\circ = \tau$; taking $V_{\tau^\circ} = V$ in (19) now yields (11).

To obtain the result for general $\mu_\tau$, we reduce to the case $\mu_\tau = 0$ by
writing

$(W, W') = (V + T, V + T') = ((V + \mu_\tau) + (T - \mu_\tau), (V + \mu_\tau) + (T' - \mu_\tau))$.

Replacing $V$ and $T$ in (19) by $V + \mu_\tau$ and $T - \mu_\tau$, respectively,
yields (10). □

4. The Jack measure 

In this section we apply Theorem 3.1 to study a property of the Jack$_\alpha$
measure on the set of partitions of size $n$. For $\alpha > 0$ the Jack$_\alpha$ measure
chooses a partition $\lambda$ of size $n$ with probability

(20) $\mathrm{Jack}_\alpha(\lambda) = \frac{\alpha^n n!}{\prod_{x \in \lambda} (\alpha a(x) + l(x) + 1)(\alpha a(x) + l(x) + \alpha)}$,

where in the product over all boxes $x$ in the partition $\lambda$, $a(x)$ denotes the
number of boxes in the same row of $x$ and to the right of $x$ (the "arm" of $x$),
and $l(x)$ denotes the number of boxes in the same column of $x$ and below $x$
(the "leg" of $x$). For example one calculates that the partition

$\lambda$ = □ □ □
    □ □

of 5 has Jack$_\alpha$ measure

$\mathrm{Jack}_\alpha(\lambda) = \frac{60\alpha^2}{(2\alpha + 2)(3\alpha + 1)(\alpha + 2)(2\alpha + 1)(\alpha + 1)}$.
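The arm and leg bookkeeping in (20) is easy to get wrong by hand, so a minimal sketch with exact rational arithmetic may help. It assumes the reading of (20) as $\alpha^n n!$ divided by the product over boxes of $(\alpha a(x) + l(x) + 1)(\alpha a(x) + l(x) + \alpha)$; the function names are hypothetical, and partitions are represented as weakly decreasing tuples of row lengths.

```python
from fractions import Fraction
from math import factorial

def partitions(n, largest=None):
    """Yield all partitions of n as weakly decreasing tuples."""
    largest = n if largest is None else largest
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def arm_leg(lam, i, j):
    """Arm and leg of the box in row i, column j (0-indexed) of lam."""
    arm = lam[i] - j - 1
    leg = sum(1 for r in range(i + 1, len(lam)) if lam[r] > j)
    return arm, leg

def jack_measure(lam, alpha):
    """Jack_alpha probability of lam under the assumed reading of (20)."""
    alpha = Fraction(alpha)
    n = sum(lam)
    denom = Fraction(1)
    for i, row in enumerate(lam):
        for j in range(row):
            a, l = arm_leg(lam, i, j)
            denom *= (alpha * a + l + 1) * (alpha * a + l + alpha)
    return alpha ** n * factorial(n) / denom
```

For instance, `jack_measure((3, 2), 2)` can be compared with the closed form of the worked example at $\alpha = 2$, and summing `jack_measure` over `partitions(n)` checks that the weights form a probability distribution.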

With $\lambda$ having the Jack$_\alpha$ distribution, we apply the theory of Section 3
to prove an explicit $L^1$ normal approximation bound for the statistic

$W_\alpha(\lambda) = \frac{\sum_{x \in \lambda} c_\alpha(x)}{\sqrt{\alpha \binom{n}{2}}}$,

where $c_\alpha(x)$ denotes the "$\alpha$-content" of $x$, defined as

$c_\alpha(x) = \alpha(\text{column number of } x - 1) - (\text{row number of } x - 1)$.

In the diagram below representing a partition of 7, each box is filled with
its $\alpha$-content:

    0      $\alpha$      $2\alpha$     $3\alpha$
   -1    $\alpha - 1$
   -2



In the Kolmogorov metric, the paper [F1] proved an $O(n^{-1/4})$ error term
for the normal approximation of $W_\alpha$; this rate was sharpened in [F4] using
martingales to $O(n^{-1/2+\epsilon})$ for any $\epsilon > 0$ and in [F3] to $O(n^{-1/2})$
using Bolthausen's inductive approach to Stein's method, but without an
explicit constant. The text [HO] proves a central limit theorem, with no error
term, for $W_\alpha$ using quantum probability. Here we give an explicit $L^1$ bound
to the normal with small constants.

To obtain our bound we construct an exchangeable pair using Kerov's
growth process for generating a random partition distributed according to
Jack$_\alpha$ measure. Given a box $x$ in the diagram of $\lambda$, again letting $a(x)$
and $l(x)$ denote the arm and leg of $x$ respectively, set

$c_\lambda(\alpha) = \prod_{x \in \lambda} (\alpha a(x) + l(x) + 1)$, $c'_\lambda(\alpha) = \prod_{x \in \lambda} (\alpha a(x) + l(x) + \alpha)$

and, for $\tau$ a partition obtained from $\lambda$ by removing a single corner box,

$\psi'_{\lambda/\tau}(\alpha) = \prod_{x \in C_{\lambda/\tau} - R_{\lambda/\tau}} \frac{(\alpha a_\lambda(x) + l_\lambda(x) + 1)}{(\alpha a_\lambda(x) + l_\lambda(x) + \alpha)} \cdot \frac{(\alpha a_\tau(x) + l_\tau(x) + \alpha)}{(\alpha a_\tau(x) + l_\tau(x) + 1)}$,

where $C_{\lambda/\tau}$ is the union of columns of $\lambda$ that intersect $\lambda - \tau$ and
$R_{\lambda/\tau}$ is the union of rows of $\lambda$ that intersect $\lambda - \tau$.

The state of Kerov's growth process at times $n = 1, 2, \ldots$ is a partition
of size $n$, starting at time one with the unique partition of 1. If at stage
$n - 1$ the state of the process is the partition $\tau$, a transition to the
partition $\lambda$ occurs with probability

$\frac{c_\tau(\alpha)}{c_\lambda(\alpha)}\, \psi'_{\lambda/\tau}(\alpha)$.

As shown in [K2], [F4], if $\tau$ is chosen from the Jack$_\alpha$ measure on partitions
of size $n - 1$, then transitioning according to this rule results in a partition
$\lambda$ of $n$ distributed according to Jack$_\alpha$ measure.
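A minimal sketch of one step of Kerov's growth process may make the definitions concrete. It assumes the transition probability from $\tau$ to $\lambda$ is $c_\tau(\alpha)\psi'_{\lambda/\tau}(\alpha)/c_\lambda(\alpha)$, with $\psi'$ a product over the boxes in the column of the added box that lie outside its row; the function names are hypothetical. For each $\tau$ one can then check that the probabilities sum to one and that the $\alpha$-content of the added box has conditional mean zero and conditional second moment $\alpha|\tau|$.

```python
from fractions import Fraction

def arm_leg(lam, i, j):
    """Arm and leg of the box in row i, column j (0-indexed) of lam."""
    arm = lam[i] - j - 1
    leg = sum(1 for r in range(i + 1, len(lam)) if lam[r] > j)
    return arm, leg

def c_lower(lam, alpha):
    """c_lambda(alpha): product of (alpha*arm + leg + 1) over the boxes of lam."""
    out = Fraction(1)
    for i, row in enumerate(lam):
        for j in range(row):
            a, l = arm_leg(lam, i, j)
            out *= alpha * a + l + 1
    return out

def growth_steps(tau, alpha):
    """Map each lam = tau + one box to (probability, alpha-content of the box)."""
    alpha = Fraction(alpha)
    rows = list(tau) + [0]
    steps = {}
    for i, j in enumerate(rows):            # candidate new box at row i, column j
        if i > 0 and rows[i - 1] == j:      # adding here would not give a partition
            continue
        new = rows[:]
        new[i] += 1
        lam = tuple(r for r in new if r > 0)
        psi = Fraction(1)
        for r in range(i):                  # boxes above the new box in column j
            al, ll = arm_leg(lam, r, j)
            at, lt = arm_leg(tau, r, j)
            psi *= (alpha * al + ll + 1) / (alpha * al + ll + alpha)
            psi *= (alpha * at + lt + alpha) / (alpha * at + lt + 1)
        prob = c_lower(tau, alpha) / c_lower(lam, alpha) * psi
        steps[lam] = (prob, alpha * j - i)  # content = alpha*(col-1) - (row-1)
    return steps

def step_moments(tau, alpha):
    """Total mass, mean and second moment of the added box's alpha-content."""
    steps = growth_steps(tau, alpha).values()
    total = sum(p for p, _ in steps)
    mean = sum(p * c for p, c in steps)
    second = sum(p * c * c for p, c in steps)
    return total, mean, second
```

For example, `step_moments((2, 2), 2)` returns total mass 1, mean 0 and second moment 8, which after dividing the content by $\sqrt{\alpha\binom{n}{2}}$ with $n = 5$ gives conditional variance $2/n$.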

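We now present an $L^1$ bound for the normal approximation of $W_\alpha$.

Theorem 4.1. Let

(21) $W_\alpha(\lambda) = \frac{\sum_{x \in \lambda} c_\alpha(x)}{\sqrt{\alpha \binom{n}{2}}}$

and let $W_\alpha$ be the value of $W_\alpha(\lambda)$ when $\lambda$ has the Jack$_\alpha$ measure
distribution for some $\alpha > 0$. Then for $Z$ a standard normal random variable,

(22) $\|\mathcal{L}(W_\alpha) - \mathcal{L}(Z)\|_1 \le \sqrt{\frac{2}{n}} \left( 2 + \sqrt{2 + \frac{\max(\alpha, 1/\alpha)}{n - 1}} \right)$.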
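To get a feel for the size of the constant, the bound can be evaluated numerically. The sketch below assumes the bound reads $\sqrt{2/n}\,\bigl(2 + \sqrt{2 + \max(\alpha, 1/\alpha)/(n-1)}\bigr)$; the function name is hypothetical.

```python
from math import sqrt

def jack_l1_bound(alpha, n):
    """Evaluate sqrt(2/n) * (2 + sqrt(2 + max(alpha, 1/alpha) / (n - 1)))."""
    if alpha <= 0 or n < 2:
        raise ValueError("need alpha > 0 and n >= 2")
    return sqrt(2.0 / n) * (2.0 + sqrt(2.0 + max(alpha, 1.0 / alpha) / (n - 1)))
```

Under this reading the bound is symmetric under $\alpha \mapsto 1/\alpha$ and decays like $(2 + \sqrt{2})\sqrt{2/n}$ for fixed $\alpha$ as $n$ grows.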

Proof. First we show (22) holds for all $\alpha \ge 1$. Constructing $\tau$ from the
Jack$_\alpha$ measure on partitions of size $n - 1$ and then taking one step in
Kerov's growth process yields $\lambda$ with the Jack$_\alpha$ measure on partitions of
size $n$, and we may write

$W_\alpha = V + T$

where

$V = \frac{\sum_{x \in \tau} c_\alpha(x)}{\sqrt{\alpha \binom{n}{2}}}$, $T = \frac{c_\alpha(\lambda/\tau)}{\sqrt{\alpha \binom{n}{2}}}$,

and $c_\alpha(\lambda/\tau)$ denotes the $\alpha$-content of the box added to $\tau$ to form $\lambda$.

It is shown in [F1] that constructing $\lambda'$ by taking another step in Kerov's
growth process from $\tau$, independently of $\lambda/\tau$ given $\tau$, and then forming
$W'_\alpha$ from $\lambda'$ as $W_\alpha$ is formed from $\lambda$, results in exchangeable
variables $W_\alpha, W'_\alpha$ that satisfy (3) with $a = 2/n$. Hence, (7) of Theorem 3.1
is satisfied. Corollary 5.3 of [F1] gives that $\mathrm{Var}(W_\alpha) = 1$.

From Section 3 of [F3], one recalls the following three facts:

(1) $E[T|\tau] = 0$ for all $\tau$.

(2) $E[T^2|\tau] = \frac{2}{n}$ for all $\tau$.

(3) $E[T^4] = \frac{\alpha^2 \binom{n}{2} + \alpha(n-1)(\alpha-1)^2 + 3\alpha^2 \binom{n-1}{2}}{\alpha^2 \binom{n}{2}^2}$.

As $V$ is measurable with respect to the $\sigma$-algebra generated by $\tau$,
condition (8) is satisfied. From properties (1) and (2) above we have,
respectively, that $\mu_\tau = 0$ and $\sigma_\tau^2$ is a constant, almost surely. Hence
the bound (11) of Theorem 3.1 holds.

Applying the Cauchy-Schwarz inequality gives that
$E|T| \le \sqrt{E T^2} = \sqrt{2/n}$, accounting for the first term in the bound.
From property (3), now applying $\alpha \ge 1$, we have

$E[T^4] \le \frac{\binom{n}{2} + 3\binom{n-1}{2}}{\binom{n}{2}^2} + \frac{\alpha(n-1)}{\binom{n}{2}^2} \le \frac{8}{n^2} + \frac{4\alpha}{n^2(n-1)}$.

The Cauchy-Schwarz inequality gives that $E|T^3| \le \sqrt{E[T^4]E[T^2]}$, and
properties (1) and (2) give $\mathrm{Var}(T) = 2/n$, yielding the final term in the
bound (22). Thus the result is shown when $\alpha \ge 1$.

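To obtain a bound for all $\alpha > 0$ note first that when taking the transpose
$\lambda^t$ of a partition $\lambda$ the roles of the arms $a(x)$ and legs $l(x)$ become
interchanged; hence, letting $\Lambda_\alpha$ be a partition with the Jack$_\alpha$
distribution, from (20), for all $\alpha > 0$ we have

$\mathcal{L}(\Lambda_\alpha) = \mathcal{L}(\Lambda^t_{1/\alpha})$.

Next, as $W_\alpha(\lambda) = -W_{1/\alpha}(\lambda^t)$ for all $\lambda$, and $\mathcal{L}(Z) = \mathcal{L}(-Z)$,

$\|\mathcal{L}(W_\alpha(\Lambda_\alpha)) - \mathcal{L}(Z)\|_1$

$= \|\mathcal{L}(-W_{1/\alpha}(\Lambda^t_\alpha)) - \mathcal{L}(Z)\|_1$

$= \|\mathcal{L}(-W_{1/\alpha}(\Lambda_{1/\alpha})) - \mathcal{L}(-Z)\|_1$

$= \|\mathcal{L}(W_{1/\alpha}(\Lambda_{1/\alpha})) - \mathcal{L}(Z)\|_1$.

Hence, as the bound (22) holds for all $\alpha \ge 1$, it holds for all $\alpha > 0$. □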



5. Polya-Eggenberger urn model 

For $m, n, A, B > 0$ fixed integers, we define a probability distribution on
the set $\{0, 1, \ldots, n\}$ by

(23) $M_{n,A,B}(k) = \binom{n}{k} \frac{(A/m)_k\, (B/m)_{n-k}}{(A/m + B/m)_n}$.

Unless clarity demands it, we will simply write $M_n(k)$ for $M_{n,A,B}(k)$. Here
$x_r = x(x + 1) \cdots (x + r - 1)$, the rising factorial, where we set $x_0 = 1$.

It is well known [K3], [M], [JK] that the distribution $M_n(k)$ can be
achieved in the following way. Imagine an urn $\mathcal{U}_{A,B}$ that initially has $A$
white and $B$ black balls. At each time step, one ball is drawn uniformly
from the urn and then returned along with $m$ balls of the same color.
If $S_n$ is the number of white balls drawn in the first $n$ draws, then

$P(S_n = k) = M_n(k)$ for $k = 0, 1, \ldots, n$.
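The agreement between the closed form and the urn dynamics can be checked exactly for small cases. The sketch below assumes the closed form $\binom{n}{k}(A/m)_k (B/m)_{n-k}/(A/m + B/m)_n$ with rising factorials, and compares it with a direct recursion over the draws; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def rising(x, r):
    """Rising factorial x (x + 1) ... (x + r - 1), with value 1 when r = 0."""
    out = Fraction(1)
    for i in range(r):
        out *= x + i
    return out

def urn_pmf_formula(n, A, B, m):
    """Closed form for P(S_n = k), k = 0..n, using rising factorials."""
    a, b = Fraction(A, m), Fraction(B, m)
    return [comb(n, k) * rising(a, k) * rising(b, n - k) / rising(a + b, n)
            for k in range(n + 1)]

def urn_pmf_direct(n, A, B, m):
    """Law of the number of white draws, obtained by running the urn step by step."""
    pmf = [Fraction(1)]
    for step in range(n):
        nxt = [Fraction(0)] * (step + 2)
        for k, p in enumerate(pmf):
            # After `step` draws with k white, the urn holds A + k*m white balls.
            white = Fraction(A + k * m, A + B + step * m)
            nxt[k + 1] += p * white
            nxt[k] += p * (1 - white)
        pmf = nxt
    return pmf
```

Exact fractions are used rather than floats so that the two computations can be compared with `==`.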

We note that when $S_n = k$ the urn $\mathcal{U}_{A,B}$ contains $A + km$ white balls.

In this section we prove the following $L^1$ normal approximation to the
distribution of $S_n$, properly standardized.

Theorem 5.1. For $n \in \mathbb{N}$ let $S_n$ be the number of white balls drawn from
$\mathcal{U}_{A,B}$ in the first $n$ time steps, and set

(24) $W_n = \sqrt{\frac{(A + B + m)n}{AB(A + B + nm)}} \left( \frac{(A + B)S_n}{n} - A \right)$.

Then $W_n$ has mean zero and variance 1, and for $Z$ a standard normal random
variable, for $n \ge (A + B + m)/2m$,

$\|\mathcal{L}(W_{n+1}) - \mathcal{L}(Z)\|_1 \le \left( \frac{4mn}{A + B + m} + \frac{A^2 + 6AB + B^2}{AB} \right) \sqrt{\frac{(A + B + m)^3}{AB(A + B + nm + m)(n + 1)}}$,

while for $n < (A + B + m)/2m$,

$\|\mathcal{L}(W_{n+1}) - \mathcal{L}(Z)\|_1 \le \left( \frac{A^2 + 8AB + B^2}{AB} \right) \sqrt{\frac{(A + B + m)^3}{AB(A + B + nm + m)(n + 1)}}$.

From Theorem 3.2 of [M], we know with $A, B, m$ fixed and $n \to \infty$,

$S_n/n \to_d \beta(A/m, B/m)$,

that is, the fraction of white balls drawn converges to the Beta distribution
with parameters $A/m$, $B/m$. In particular, the limiting value of the bound as
$n \to \infty$, giving an $L^1$ bound between the standardized Beta distribution and
the normal, is $4\sqrt{m(A + B + m)/AB}$; for, say, $A = B$, the bound specializes
to $4\sqrt{m(2A + m)}/A$, which tends to zero at rate $1/\sqrt{A}$ if $m$ is fixed
and $A$ grows.
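The limiting value can be checked numerically against the finite-$n$ bound. The sketch below assumes the bound for $W_{n+1}$ has leading factor $4mn/(A+B+m) + (A^2+6AB+B^2)/(AB)$ when $n \ge (A+B+m)/2m$, and $(A^2+8AB+B^2)/(AB)$ otherwise, multiplied by $\sqrt{(A+B+m)^3/(AB(A+B+nm+m)(n+1))}$; the function names are hypothetical.

```python
from math import sqrt

def urn_l1_bound(n, A, B, m):
    """Assumed form of the L1 bound for W_{n+1} in the urn model."""
    root = sqrt((A + B + m) ** 3 / (A * B * (A + B + n * m + m) * (n + 1)))
    if n >= (A + B + m) / (2 * m):
        lead = 4 * m * n / (A + B + m) + (A * A + 6 * A * B + B * B) / (A * B)
    else:
        lead = (A * A + 8 * A * B + B * B) / (A * B)
    return lead * root

def urn_l1_limit(A, B, m):
    """Limit of the bound as n grows: 4 * sqrt(m (A + B + m) / (A B))."""
    return 4 * sqrt(m * (A + B + m) / (A * B))
```

With `A == B` the limit equals `4 * sqrt(m * (2 * A + m)) / A`, so for fixed `m` it shrinks like one over the square root of `A`.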

For what follows it is useful to relate the distribution $M_n(k)$ to up and
down chains. On the set $\Gamma_n = \{(n, k) : 0 \le k \le n\}$, placing directed edges
from $(n - 1, k)$ to $(n, k)$ and to $(n, k+1)$ results in what is known as
Pascal's lattice [K3]. It is convenient to define $d((n,k)) = \binom{n}{k}$, the
number of paths from $(0, 0)$ to $(n, k)$. More generally, one defines
$d((n, k)/(m, j))$ to be the number of paths from $(m, j)$ to $(n, k)$; this is
$\binom{n-m}{k-j}$.

We define an "up" chain that transitions from $(n, k)$ to $(n + 1, k)$ with
probability $(B + nm - km)/(A + B + nm)$ and to $(n+1, k+1)$ with probability
$(A + km)/(A + B + nm)$. We also define a "down" chain that transitions from
$(n, k)$ to $(n - 1, k - 1)$ with probability $k/n$ and to $(n - 1, k)$ with
probability $1 - (k/n)$. One easily checks that if $(n - 1, k)$ is distributed
according to $M_{n-1}$, then applying the up chain gives an element of $\Gamma_n$
distributed according to $M_n$. Similarly, if $(n, k)$ is distributed according
to $M_n$, one checks that applying the down chain gives an element of
$\Gamma_{n-1}$ distributed according to $M_{n-1}$.

We denote the up chain from $\Gamma_{n-1}$ to $\Gamma_n$ by $U_{n-1}$ and the down chain
from $\Gamma_n$ to $\Gamma_{n-1}$ by $D_n$. A straightforward computation yields that

(25) $D_{n+1} U_n = c_n U_{n-1} D_n + (1 - c_n) I_n$



with c„ = ^}'^i)^A+B+^) ' ^° ^^^^ °^ ^■'^s in force. 

The following lemma shows how to use the up and down chains to
construct a Stein pair, that is, a pair of exchangeable random variables
satisfying (3).

Lemma 5.2. Let $W_n$ be given by (24) with $S_n$ the number of white balls
drawn from $\mathcal{U}_{A,B}$ in the first $n$ time steps. Now construct $S'_n$ by
transitioning down using $D_n$ and then up using $U_{n-1}$, and let $W'_n$ be given
by (24) with $S_n$ replaced by $S'_n$. Then $W_n, W'_n$ is an $a_n$-Stein pair with

$a_n = \frac{A + B}{n(A + B + nm - m)}$.






Proof. By Theorem 4.3 of [F5] and equation (25), a left eigenvector with
eigenvalue $1 - a_n$ is obtained by applying the operator $U^{n-1}$ to
$(1, 0) - (1, 1)$. From the general theory of down-up chains (see [F5]), one
has that

$U^{n-1}(1, 0) = \sum_{k=0}^{n} \frac{M_n(k)\, d((n,k)/(1,0))}{\binom{n}{k} \frac{B}{A+B}}\, (n, k)$.

Similarly,

$U^{n-1}(1, 1) = \sum_{k=0}^{n} \frac{M_n(k)\, d((n,k)/(1,1))}{\binom{n}{k} \frac{A}{A+B}}\, (n, k)$.

Since $U_{n-1}D_n$ is a reversible Markov chain with stationary distribution
$M_n$, its right eigenvectors are obtained from its left eigenvectors by dividing
by $M_n$. Thus

$\frac{\binom{n-1}{k}(A + B)}{\binom{n}{k} B} - \frac{\binom{n-1}{k-1}(A + B)}{\binom{n}{k} A} = \frac{A + B}{AB} \left[ A - \frac{k(A + B)}{n} \right]$

is a right eigenvector of $U_{n-1}D_n$ with eigenvalue
$1 - \frac{A+B}{n(A+B+nm-m)}$. Since $W_n$ is a scalar multiple of
$A - \frac{k(A+B)}{n}$ when $S_n = k$, the result follows. □
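Both the commutation relation between the up and down chains and the linearity condition of Lemma 5.2 can be verified exactly on small levels. In the sketch below the constant `c_n` is recomputed directly from the two chains as $n(A+B+nm-m)/((n+1)(A+B+nm))$ and should be treated as an assumption of this example; the function names are hypothetical.

```python
from fractions import Fraction

def up_row(n, k, A, B, m):
    """One up step from (n, k): law of the second coordinate at level n + 1."""
    white = Fraction(A + k * m, A + B + n * m)
    return {k + 1: white, k: 1 - white}

def down_row(n, k):
    """One down step from (n, k): law of the second coordinate at level n - 1."""
    return {k - 1: Fraction(k, n), k: Fraction(n - k, n)}

def compose(first, second):
    """Law obtained by drawing j from `first`, then taking a step from second(j)."""
    out = {}
    for j, p in first.items():
        if p == 0:
            continue
        for i, q in second(j).items():
            out[i] = out.get(i, 0) + p * q
    return out

def check_level(n, A, B, m):
    """True if the commutation relation and the Stein identity hold on level n."""
    c_n = Fraction(n * (A + B + n * m - m), (n + 1) * (A + B + n * m))  # assumed
    a_n = Fraction(A + B, n * (A + B + n * m - m))
    g = lambda k: A - Fraction(k * (A + B), n)        # proportional to W_n
    for k in range(n + 1):
        du = compose(down_row(n, k), lambda j: up_row(n - 1, j, A, B, m))
        ud = compose(up_row(n, k, A, B, m), lambda j: down_row(n + 1, j))
        for i in range(-1, n + 2):
            rhs = c_n * du.get(i, 0) + (1 - c_n) * (1 if i == k else 0)
            if ud.get(i, 0) != rhs:
                return False
        # E[g(S'_n) | S_n = k] should equal (1 - a_n) g(k).
        if sum(p * g(i) for i, p in du.items()) != (1 - a_n) * g(k):
            return False
    return True
```

Here `du` is the down-then-up kernel used to build $S'_n$ from $S_n$, and `ud` is the up-then-down kernel on the same level.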



The next goal is to compute the mean and variance of $W_n$ given by (24)
with $S_n$ the number of white balls drawn in the first $n$ draws. Clearly for
all $n \ge 1$ one may write

$S_n = \mathbf{1}_0 + \cdots + \mathbf{1}_{n-1}$

where $\mathbf{1}_j = 1$ if a white ball is drawn at time $j$, and $\mathbf{1}_j = 0$ otherwise.
The next lemma computes the mean and mixed second moments of the
indicators $\mathbf{1}_j$.

Lemma 5.3. For $j = 0, \ldots, n - 1$, let $\mathbf{1}_j$ denote the indicator that a white
ball is drawn from $\mathcal{U}_{A,B}$ at time $j$. Then

(1) $E[\mathbf{1}_j] = \frac{A}{A+B}$ for all $j \in \{0, \ldots, n-1\}$.

(2) $E[\mathbf{1}_h \mathbf{1}_j] = \frac{A(A+m)}{(A+B)(A+B+m)}$ for all $0 \le h < j \le n-1$.

(3) $E[S_n] = \frac{nA}{A+B}$.

Proof. It is classical and elementary that the indicators $\mathbf{1}_j$,
$j = 0, \ldots, n - 1$, are an exchangeable sequence (see [JK] or [M] for a proof).
Thus $E[\mathbf{1}_j]$ is the probability that the first ball drawn is white, and
$E[\mathbf{1}_h \mathbf{1}_j]$ is the probability that the first two balls drawn are white.
These observations, and linearity of expectation, yield the lemma. □

With the help of Lemma 5.3, we now compute the mean and variance of $W_n$.

Lemma 5.4. If $W_n$ is given by (24) where $S_n$ is the number of white balls
drawn from $\mathcal{U}_{A,B}$ in the first $n$ time steps, then

$E[S_n^2] = \frac{nA}{A + B} + 2\binom{n}{2} \frac{A(A + m)}{(A + B)(A + B + m)}$,

$EW_n = 0$, and $\mathrm{Var}(W_n) = 1$.
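The moment formulas can be confirmed exactly for small urns. The sketch below recomputes the law of $S_n$ by running the urn step by step and compares its second moment and variance with $nA/(A+B) + n(n-1)A(A+m)/((A+B)(A+B+m))$ and $AB(A+B+nm)n/((A+B+m)(A+B)^2)$; the latter is the variance of $S_n$ that makes $W_n$ have unit variance. The function names are hypothetical.

```python
from fractions import Fraction

def urn_pmf_direct(n, A, B, m):
    """Law of the number of white draws after n steps of the urn."""
    pmf = [Fraction(1)]
    for step in range(n):
        nxt = [Fraction(0)] * (step + 2)
        for k, p in enumerate(pmf):
            white = Fraction(A + k * m, A + B + step * m)
            nxt[k + 1] += p * white
            nxt[k] += p * (1 - white)
        pmf = nxt
    return pmf

def second_moment_formula(n, A, B, m):
    """n A / (A + B) + 2 C(n, 2) A (A + m) / ((A + B)(A + B + m))."""
    return Fraction(n * A, A + B) + n * (n - 1) * Fraction(
        A * (A + m), (A + B) * (A + B + m))

def variance_formula(n, A, B, m):
    """A B (A + B + n m) n / ((A + B + m)(A + B)^2)."""
    return Fraction(A * B * (A + B + n * m) * n, (A + B + m) * (A + B) ** 2)

def empirical_moments(n, A, B, m):
    """Exact mean, second moment and variance of S_n from the recursion."""
    pmf = urn_pmf_direct(n, A, B, m)
    mean = sum(k * p for k, p in enumerate(pmf))
    second = sum(k * k * p for k, p in enumerate(pmf))
    return mean, second, second - mean * mean
```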



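Proof. Since $W_n, W'_n$ is a Stein pair we have that $EW_n = 0$ by (5). Now,
using the fact that $\mathbf{1}_j^2 = \mathbf{1}_j$, and both parts of Lemma 5.3, we obtain

$E[S_n^2] = E[(\mathbf{1}_0 + \cdots + \mathbf{1}_{n-1})^2]$

$= E\left[ \sum_{j=0}^{n-1} \mathbf{1}_j + 2 \sum_{0 \le h < j \le n-1} \mathbf{1}_h \mathbf{1}_j \right]$

$= \frac{nA}{A + B} + 2\binom{n}{2} \frac{A(A + m)}{(A + B)(A + B + m)}$,

yielding the first claim.

Applying the expression for $E[S_n]$ given by Lemma 5.3 it follows that

$\mathrm{Var}(S_n) = \frac{nA}{A + B} + 2\binom{n}{2} \frac{A(A + m)}{(A + B)(A + B + m)} - \frac{n^2 A^2}{(A + B)^2} = \frac{AB(A + B + nm)n}{(A + B + m)(A + B)^2}$.

Hence, from the definition (24) of $W_n$ we conclude that

$\mathrm{Var}(W_n) = \frac{(A + B + m)(A + B)^2}{AB(A + B + nm)n}\, \mathrm{Var}(S_n) = 1$. □

We will apply Theorem 3.1 by writing $W_{n+1} = V + T$ where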



(26) $V = \sqrt{\frac{(A + B + m)(n + 1)}{AB(A + B + nm + m)}} \left( \frac{(A + B)S_n}{n + 1} - A \right)$

and

(27) $T = \sqrt{\frac{A + B + m}{AB(A + B + nm + m)(n + 1)}}\, (A + B)\, \mathbf{1}_n$,

and letting $\tau = S_n$. We note that the condition in Theorem 3.1 that $V$
be $\tau$ measurable is here clearly satisfied. The following lemma gives the
properties of $T$ needed for computing an $L^1$ bound using Theorem 3.1.

Lemma 5.5. Let $T$ be given by (27) and $\tau = S_n$.

(1) The conditional mean $\mu_\tau = E(T|\tau)$ is given by

$\mu_\tau = \sqrt{\frac{A + B + m}{AB(A + B + nm + m)(n + 1)}} \cdot \frac{(A + B)(A + mS_n)}{A + B + mn}$.

(2) The conditional variance $\sigma_\tau^2 = E((T - \mu_\tau)^2|\tau)$ is given by

$\sigma_\tau^2 = \frac{(A + B + m)(A + B)^2}{AB(A + B + nm + m)(n + 1)} \cdot \frac{(A + mS_n)(nm - mS_n + B)}{(A + B + mn)^2}$.

(3) The variance $\mathrm{Var}(T - \mu_\tau)$ satisfies

$\mathrm{Var}(T - \mu_\tau) = \frac{A + B}{(n + 1)(A + B + nm)}$.

(4) The absolute deviation of $T$ about $\mu_\tau$ satisfies

$E|T - \mu_\tau| \le \sqrt{\frac{(A + B + m)(A + B)^2}{AB(A + B + nm + m)(n + 1)}}$.

(5) The third order deviation of $T$ about $\mu_\tau$, standardized by
$\mathrm{Var}(T - \mu_\tau)$, satisfies

$\frac{E|T - \mu_\tau|^3}{\mathrm{Var}(T - \mu_\tau)} \le \sqrt{\frac{(A + B + m)^3}{AB(A + B + nm + m)(n + 1)}} \cdot \frac{(A + B)^2}{AB}$.
Proof. Parts (1) and (2) follow immediately from (27) and that

(28) $P(\mathbf{1}_j = 1 | S_j = k) = \frac{A + mk}{A + B + mj}$

for all $j = 0, \ldots, n - 1$, $k = 0, \ldots, j$.

For part (3), first note that as $E(T - \mu_\tau|\tau) = 0$ we have
$\mathrm{Var}(T - \mu_\tau) = E(T - \mu_\tau)^2$. Now again using (28), we have that
$E((T - \mu_\tau)^2|S_n)$ equals

$\left( \frac{(A + B + m)(A + B)^2}{AB(A + B + nm + m)(n + 1)} \right) E\left[ \left( \mathbf{1}_n - \frac{A + mS_n}{A + B + nm} \right)^2 \Big| S_n \right]$

$= \left( \frac{(A + B + m)(A + B)^2}{AB(A + B + nm + m)(n + 1)} \right) \left( \frac{(A + mS_n)(B + m(n - S_n))}{(A + B + nm)^2} \right)$.

Expanding the product $(A + mS_n)(B + m(n - S_n))$, taking expectation
using the expressions for $E[S_n]$ and $E[S_n^2]$ provided by Lemmas 5.3 and 5.4,
respectively, the claim follows after some simplification.






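For part (4) one has that

$E|T - \mu_\tau| = E[E(|T - \mu_\tau| \,|\, S_n)]$

$= \sqrt{\frac{A + B + m}{AB(A + B + nm + m)(n + 1)}}\, (A + B)\, E\left[ E\left( \left| \mathbf{1}_n - \frac{A + mS_n}{A + B + nm} \right| \Big| S_n \right) \right]$

$\le \sqrt{\frac{A + B + m}{AB(A + B + nm + m)(n + 1)}}\, (A + B)$.

The second equality used (28), and the inequality that

(29) $E\left[ \left| \mathbf{1}_n - \frac{A + mS_n}{A + B + nm} \right|^p \Big| S_n \right] \le 1$ for all $p > 0$

with $p = 1$.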

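Now, for part (5), similarly, applying (29) with $p = 3$ we obtain

$E|T - \mu_\tau|^3 = E[E(|T - \mu_\tau|^3 \,|\, S_n)]$

$= \left[ \frac{A + B + m}{AB(A + B + nm + m)(n + 1)} \right]^{3/2} (A + B)^3\, E\left[ E\left( \left| \mathbf{1}_n - \frac{A + mS_n}{A + B + nm} \right|^3 \Big| S_n \right) \right]$

$\le \left[ \frac{A + B + m}{AB(A + B + nm + m)(n + 1)} \right]^{3/2} (A + B)^3$.

Part (5) now follows from part (3) by division, using
$(A + B + nm)/(A + B + nm + m) \le 1$. □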



Specializing (9) to the case at hand, with $M_{n,A,B}(k)$ the distribution of
$S_n$ given by (23), we now consider constructing a coupling of $S_n$ to a random
variable $S_n^\circ$ with distribution

(30) $M^\circ_{n,A,B}(k) = \frac{\sigma_k^2}{a_{n+1}}\, M_{n,A,B}(k)$,

where $\sigma_k^2$ is the conditional variance in part (2) of Lemma 5.5 evaluated at
$S_n = k$, and $a_{n+1}$ is given by Lemma 5.2. The next result shows that one can
achieve a variable with the distribution of $S_n^\circ$ by adding $2m$ additional balls
to the urn at time zero, $m$ white and $m$ black, that is, by using the urn
$\mathcal{U}_{A+m,B+m}$.

Lemma 5.6.

$M^\circ_{n,A,B} = M_{n,A+m,B+m}$.

Proof. From (2) of Lemma 5.5 and Lemma 5.2 we have

$\frac{\sigma_k^2}{a_{n+1}} = \frac{(A/m + B/m)(A/m + B/m + 1)(A/m + k)(B/m + n - k)}{(A/m)(B/m)(A/m + B/m + n)(A/m + B/m + n + 1)}$.

Hence, for all $k \in \{0, \ldots, n\}$,

$M^\circ_{n,A,B}(k) = \frac{(A/m + B/m)(A/m + B/m + 1)(A/m + k)(B/m + n - k)}{(A/m)(B/m)(A/m + B/m + n)(A/m + B/m + n + 1)} \binom{n}{k} \frac{(A/m)_k (B/m)_{n-k}}{(A/m + B/m)_n}$

$= \binom{n}{k} \frac{(A/m + 1)_k (B/m + 1)_{n-k}}{(A/m + B/m + 2)_n}$

$= M_{n,A+m,B+m}(k)$. □
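The identity of Lemma 5.6 lends itself to an exact check. The sketch below reweights $M_{n,A,B}(k)$ by $\sigma_k^2/a_{n+1}$, taking $\sigma_k^2$ to be the conditional variance of $T$ given $S_n = k$ and $a_{n+1} = (A+B)/((n+1)(A+B+nm))$ as assumptions, and compares the result with the urn that starts with $m$ extra balls of each color. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def rising(x, r):
    """Rising factorial x (x + 1) ... (x + r - 1), with value 1 when r = 0."""
    out = Fraction(1)
    for i in range(r):
        out *= x + i
    return out

def urn_pmf(n, A, B, m):
    """P(S_n = k), k = 0..n, via rising factorials."""
    a, b = Fraction(A, m), Fraction(B, m)
    return [comb(n, k) * rising(a, k) * rising(b, n - k) / rising(a + b, n)
            for k in range(n + 1)]

def biased_pmf(n, A, B, m):
    """Reweight P(S_n = k) by sigma_k^2 / a_{n+1} (both assumed as in the lead-in)."""
    a_next = Fraction(A + B, (n + 1) * (A + B + n * m))
    scale = Fraction((A + B + m) * (A + B) ** 2,
                     A * B * (A + B + n * m + m) * (n + 1))
    out = []
    for k, p in enumerate(urn_pmf(n, A, B, m)):
        sigma2 = scale * Fraction((A + k * m) * (B + m * (n - k)),
                                  (A + B + m * n) ** 2)
        out.append(sigma2 / a_next * p)
    return out
```

That the reweighted values sum to one is the statement that the average conditional variance equals $a_{n+1}$.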



Lemma 5.6 shows that for the process $S_n$ on the urn $\mathcal{U}_{A,B}$, the process
$S_n^\circ$ is for the urn $\mathcal{U}_{A+m,B+m}$. As for both processes no additional balls
have been added at time zero, we have that

(31) $S_0 = 0$ and $S_0^\circ = 0$.

As at times $n \ge 1$ both of these chains increase by 1 when a white ball has
been selected, if $S_n = k$ and $S_n^\circ = j$, then $S_{n+1} = k + 1$ and
$S^\circ_{n+1} = j + 1$ with respective probabilities

(32) $s_n(k) = \frac{A + km}{A + B + mn}$ and $s_n^\circ(j) = \frac{A + m + jm}{A + B + 2m + mn}$.

We now couple $S_n$ and $S_n^\circ$ by coupling, at each stage, the two Bernoulli
variables that indicate the drawing of a white ball in each urn. In particular,
we couple these two Bernoulli variables so that the chance they are not equal
is minimized.

Theorem 5.7. Let $s_n(k)$ and $s_n^\circ(j)$ be given by (32) for $n, j, k \in \mathbb{N}$.
Then the bivariate chain taking values in $\mathbb{N} \times \mathbb{N}$ characterized by the
initial condition $(S_0, S_0^\circ) = (0, 0)$ and transitions

$P_{n+1,n}(u, v|k, j) = P(S_{n+1} = u, S^\circ_{n+1} = v \,|\, S_n = k, S_n^\circ = j)$

at times $n \ge 0$ according to

$P_{n+1,n}(u, v|k, j) = \begin{cases} \min(s_n(k), s_n^\circ(j)) & (u, v) = (k + 1, j + 1) \\ (s_n^\circ(j) - s_n(k))_+ & (u, v) = (k, j + 1) \\ (s_n(k) - s_n^\circ(j))_+ & (u, v) = (k + 1, j) \\ 1 - \max(s_n(k), s_n^\circ(j)) & (u, v) = (k, j) \end{cases}$

is a coupling on a joint space of the urn models $\mathcal{U}_{A,B}$ and
$\mathcal{U}_{A+m,B+m}$, respectively.

In addition, letting

$N = \inf\{n \ge 1 : S_n \ne S_n^\circ\}$

we have

(33) $|S_N - S_N^\circ| = 1$

and

if $S_N^\circ = S_N + 1$ then $S_n^\circ \ge S_n$ for all $n \ge 0$,

while, otherwise,

if $S_N = S_N^\circ + 1$ then $S_n \ge S_n^\circ$ for all $n \ge 0$.

Proof. That we must have $(S_0, S_0^\circ) = (0, 0)$ is clear by (31). As marginally
for $S_n$ we have

$P(S_{n+1} = k + 1|S_n = k) = \min(s_n(k), s_n^\circ(j)) + (s_n(k) - s_n^\circ(j))_+ = s_n(k)$,

and similarly for $S_n^\circ$, both marginal transition functions agree with those
specified by (32), hence the joint chain is a coupling of the two urn models
in question. Further, since $S_0 = S_0^\circ$, and at most one white ball is drawn
from either of the two urns at each time $n \ge 0$, (33) holds.

Taking the difference between the probabilities of drawing a white ball
from either of the two urns yields

(34) $s_n^\circ(j) - s_n(k) = m\, \frac{(A + mn)(j - k - 1) + B(j - k + 1) + 2m(n - k)}{(A + B + mn)(A + B + 2m + mn)}$.

Suppose now that $S_N^\circ = S_N + 1$. We show by induction that
$S_n^\circ \ge S_n + 1$ for all $n \ge N$. Clearly the claim is true for $n = N$. Assume
that $S_n^\circ \ge S_n + 1$ for some $n \ge N$, say $(S_n, S_n^\circ) = (k, j)$ with
$j - k \ge 1$. Then, by (34) we see that $s_n^\circ(j) \ge s_n(k)$, and hence
$(S_{n+1}, S^\circ_{n+1})$ equals $(k + 1, j + 1)$, $(k, j + 1)$ or $(k, j)$ with respective
probabilities $s_n(k)$, $s_n^\circ(j) - s_n(k)$ and $1 - s_n^\circ(j)$. In particular,
$S^\circ_{n+1} \ge S_{n+1} + 1$.

As the same argument applies in the case $S_N \ge S_N^\circ + 1$, and since
$S_n^\circ = S_n$ for all $0 \le n < N$ by the definition of $N$, the final claim of the
theorem is shown. □
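The coupled chain of Theorem 5.7 is small enough to enumerate exactly. The sketch below propagates the joint law of $(S_n, S_n^\circ)$ under the four-case transition rule, with $s_n(k) = (A+km)/(A+B+mn)$ and $s_n^\circ(j) = (A+m+jm)/(A+B+2m+mn)$, and recovers the two marginals so they can be compared with the laws of the urns started from $(A, B)$ and $(A+m, B+m)$. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def rising(x, r):
    """Rising factorial x (x + 1) ... (x + r - 1), with value 1 when r = 0."""
    out = Fraction(1)
    for i in range(r):
        out *= x + i
    return out

def urn_pmf(n, A, B, m):
    """P(S_n = k), k = 0..n, via rising factorials."""
    a, b = Fraction(A, m), Fraction(B, m)
    return [comb(n, k) * rising(a, k) * rising(b, n - k) / rising(a + b, n)
            for k in range(n + 1)]

def coupled_chain(n, A, B, m):
    """Exact joint law of the two white-draw counts after n coupled steps."""
    law = {(0, 0): Fraction(1)}
    for t in range(n):
        nxt = {}
        for (k, j), p in law.items():
            s = Fraction(A + k * m, A + B + m * t)
            s0 = Fraction(A + m + j * m, A + B + 2 * m + m * t)
            moves = {
                (k + 1, j + 1): min(s, s0),     # both urns draw white
                (k, j + 1): max(s0 - s, 0),     # only the enlarged urn draws white
                (k + 1, j): max(s - s0, 0),     # only the original urn draws white
                (k, j): 1 - max(s, s0),         # both urns draw black
            }
            for state, q in moves.items():
                if q:
                    nxt[state] = nxt.get(state, 0) + p * q
        law = nxt
    return law

def marginals(law, n):
    """Marginal laws of the first and second coordinates."""
    first = [Fraction(0)] * (n + 1)
    second = [Fraction(0)] * (n + 1)
    for (k, j), p in law.items():
        first[k] += p
        second[j] += p
    return first, second

def mean_abs_gap(law):
    """Expected absolute difference between the two coordinates."""
    return sum(abs(k - j) * p for (k, j), p in law.items())
```

Because each step uses the maximal coupling of the two Bernoulli draws, `mean_abs_gap` gives the exact coupling cost for the chosen parameters.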

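We now compute a bound on $E|S_n^\circ - S_n|$ for the coupling provided by
Theorem 5.7.

Lemma 5.8. The joint chain $(S_n, S_n^\circ)$ as specified in Theorem 5.7 satisfies

$E|S_n^\circ - S_n| \le \frac{2mn}{A + B + m}\, \mathbf{1}\left( n \ge \frac{A + B + m}{2m} \right) + \mathbf{1}\left( n < \frac{A + B + m}{2m} \right)$.

Proof. By Theorem 5.7, with $N$ as defined there, we have

$|S_n^\circ - S_n| = (S_n^\circ - S_n)\mathbf{1}_{\{n \ge N,\ S_N^\circ = S_N + 1\}} + (S_n - S_n^\circ)\mathbf{1}_{\{n \ge N,\ S_N = S_N^\circ + 1\}}$.

For the first expectation,

$E\left[ (S_n^\circ - S_n)\mathbf{1}_{\{n \ge N,\ S_N^\circ = S_N + 1\}} \right]$

$= \sum_{t=1}^{n-1} E\left[ (S_n^\circ - S_n)\mathbf{1}_{\{N = t,\ S_N^\circ = S_N + 1\}} \right] + P(N = n, S_N^\circ = S_N + 1)$

$= \sum_{t=1}^{n-1} \sum_{u \ge 0} E\left[ (S_n^\circ - S_n)\mathbf{1}_{\{N = t,\ S_N^\circ = S_N + 1,\ S_N = u\}} \right] + P(N = n, S_N^\circ = S_N + 1)$

$= \sum_{t=1}^{n-1} \sum_{u \ge 0} E(S_n^\circ - S_n \,|\, N = t, S_N^\circ = S_N + 1, S_N = u)\, P(N = t, S_N^\circ = S_N + 1, S_N = u) + P(N = n, S_N^\circ = S_N + 1)$.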



For $1 \le t \le n - 1$, on the conditioning event, urn $\mathcal{U}_{A,B}$ has $A + mu$
white balls and $B + mt - mu$ black balls at time $t$, and then has been run for
time $n - t$. At each of these time steps, by Lemma 5.3 there is probability
$(A + mu)/(A + B + mt)$ that a white ball will be selected from urn $\mathcal{U}_{A,B}$.

Similarly, for $1 \le t \le n - 1$, on the conditioning event, urn $\mathcal{U}_{A+m,B+m}$
has $A + m + (mu + m) = A + mu + 2m$ white balls and
$B + m + mt - (mu + m) = B + mt - mu$ black balls at time $t$, and then has been
run for time $n - t$. At each of these time steps, by Lemma 5.3 the probability
is $(A + mu + 2m)/(A + B + mt + 2m)$ that a white ball is selected from urn
$\mathcal{U}_{A+m,B+m}$.

Hence, as it may be that all the balls chosen from $\mathcal{U}_{A,B}$ before time $N$
are black, that is, we may have $S_N = u$ for $u = 0$, we have



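$E(S_n^\circ - S_n \,|\, N = t, S_N^\circ = S_N + 1, S_N = u)$

$= (n - t) \left[ \frac{A + mu + 2m}{A + B + mt + 2m} - \frac{A + mu}{A + B + mt} \right]$

$= (n - t)\, \frac{2m(B + mt - mu)}{(A + B + mt)(A + B + mt + 2m)}$

$\le (n - t)\, \frac{2m(B + mt)}{(A + B + mt)(A + B + mt + 2m)}$

$\le \frac{2m(n - t)(B + mt)}{(A + B + mt)^2}$

$\le \frac{2m(n - t)}{A + B + mt}$.

Therefore

$E\left[ (S_n^\circ - S_n)\mathbf{1}_{\{n \ge N,\ S_N^\circ = S_N + 1\}} \right]$

$\le \sum_{t=1}^{n-1} \sum_{u \ge 0} \frac{2m(n - t)}{A + B + mt}\, P(N = t, S_N^\circ = S_N + 1, S_N = u) + P(N = n, S_N^\circ = S_N + 1)$

$= \sum_{t=1}^{n-1} \frac{2m(n - t)}{A + B + mt}\, P(N = t, S_N^\circ = S_N + 1) + P(N = n, S_N^\circ = S_N + 1)$.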

Reversing the roles of $S_n$ and $S_n^\circ$, though here noting that it is necessary
that $u \le t - 1$ for the event $\{N = t, S_N = S_N^\circ + 1, S_N^\circ = u\}$ to have
positive probability, we similarly obtain

$E\left[ (S_n - S_n^\circ)\mathbf{1}_{\{n \ge N,\ S_N = S_N^\circ + 1\}} \right] \le \sum_{t=1}^{n-1} \frac{2m(n - t)}{A + B + mt}\, P(N = t, S_N = S_N^\circ + 1) + P(N = n, S_N = S_N^\circ + 1)$.

Now using that $(n - t)/(A + B + mt)$ is a decreasing function of $t \ge 0$,
summing yields

$E|S_n^\circ - S_n| \le 2m \sum_{t=1}^{n-1} \frac{n - t}{A + B + mt}\, P(N = t) + P(N = n)$

$\le \frac{2mn}{A + B + m}\, P(N \le n - 1) + P(N = n)$

$\le \frac{2mn}{A + B + m}\, \mathbf{1}\left( n \ge \frac{A + B + m}{2m} \right) + \mathbf{1}\left( n < \frac{A + B + m}{2m} \right)$

as claimed, where in the final inequality we have used the fact that since
$\alpha + \beta \le 1$ for $\alpha = P(N \le n - 1)$ and $\beta = P(N = n)$, then for any
nonnegative numbers $a$ and $b$ we have $\alpha a + \beta b \le \max(a, b)$. □

Proof of Theorem 5.1. That $EW_n = 0$ and $\mathrm{Var}(W_n) = 1$ is the content of
Lemma 5.4.

We now compute the $L^1$ bound using Theorem 3.1. Applying (1) of
Lemma 5.5 with $\tau^\circ$ and $\tau$ we obtain

$|\mu_{\tau^\circ} - \mu_\tau| = \sqrt{\frac{(A + B + m)(A + B)^2}{AB(A + B + nm + m)(n + 1)}} \cdot \frac{m|S_n^\circ - S_n|}{A + B + mn} \le \sqrt{\frac{(A + B + m)(A + B)^2}{AB(A + B + nm + m)(n + 1)}}$,

since both $S_n$ and $S_n^\circ$ take values between 0 and $n$.
Applying the definition (26) of $V$ on $\tau^\circ$ and $\tau$,

$|V_{\tau^\circ} - V| = \sqrt{\frac{(A + B + m)(A + B)^2}{AB(A + B + nm + m)(n + 1)}}\, |S_n^\circ - S_n|$,

so that, by Lemma 5.8, for $n \ge (A + B + m)/2m$ we have

$E|V_{\tau^\circ} - V| \le \frac{2mn}{A + B + m} \sqrt{\frac{(A + B + m)(A + B)^2}{AB(A + B + nm + m)(n + 1)}}$,

while for $n < (A + B + m)/2m$,

$E|V_{\tau^\circ} - V| \le \sqrt{\frac{(A + B + m)(A + B)^2}{AB(A + B + nm + m)(n + 1)}}$.

The calculation is completed by using (4) and (5) of Lemma 5.5 for the
final two terms, and then applying the inequality
$(A + B + m)(A + B)^2 \le (A + B + m)^3$. □